anova kernel
Higher-Order Factorization Machines
Mathieu Blondel, Akinori Fujino, Naonori Ueda, Masakazu Ishihata
Factorization machines (FMs) are a supervised learning approach that can use second-order feature combinations even when the data is very high-dimensional. Unfortunately, despite increasing interest in FMs, there exists to date no efficient training algorithm for higher-order FMs (HOFMs). In this paper, we present the first generic yet efficient algorithms for training arbitrary-order HOFMs. We also present new variants of HOFMs with shared parameters, which greatly reduce model size and prediction times while maintaining similar accuracy. We demonstrate the proposed approaches on four different link prediction tasks.
Higher-Order Factorization Machines
Factorization machines (FMs) are a supervised learning approach that can use second-order feature combinations even when the data is very high-dimensional. Unfortunately, despite increasing interest in FMs, there exists to date no efficient training algorithm for higher-order FMs (HOFMs). In this paper, we present the first generic yet efficient algorithms for training arbitrary-order HOFMs. We also present new variants of HOFMs with shared parameters, which greatly reduce model size and prediction times while maintaining similar accuracy. We demonstrate the proposed approaches on four different link prediction tasks.
Learning in High-Dimensional Feature Spaces Using ANOVA-Based Fast Matrix-Vector Multiplication
Nestler, Franziska, Stoll, Martin, Wagner, Theresa
Kernel matrices are crucial in many learning tasks such as support vector machines or kernel ridge regression. The kernel matrix is typically dense and large-scale. Depending on the dimension of the feature space even the computation of all of its entries in reasonable time becomes a challenging task. For such dense matrices the cost of a matrix-vector product scales quadratically with the dimensionality N , if no customized methods are applied. We propose the use of an ANOVA kernel, where we construct several kernels based on lower-dimensional feature spaces for which we provide fast algorithms realizing the matrix-vector products. We employ the non-equispaced fast Fourier transform (NFFT), which is of linear complexity for fixed accuracy. Based on a feature grouping approach, we then show how the fast matrix-vector products can be embedded into a learning method choosing kernel ridge regression and the conjugate gradient solver. We illustrate the performance of our approach on several data sets.
Gaussian Quadrature for Kernel Features
Dao, Tri, Sa, Christopher M. De, Rรฉ, Christopher
Kernel methods have recently attracted resurgent interest, showing performance competitive with deep neural networks in tasks such as speech recognition. The random Fourier features map is a technique commonly used to scale up kernel machines, but employing the randomized feature map means that $O(\epsilon^{-2})$ samples are required to achieve an approximation error of at most $\epsilon$. We investigate some alternative schemes for constructing feature maps that are deterministic, rather than random, by approximating the kernel in the frequency domain using Gaussian quadrature. We show that deterministic feature maps can be constructed, for any $\gamma > 0$, to achieve error $\epsilon$ with $O(e^{e^\gamma} + \epsilon^{-1/\gamma})$ samples as $\epsilon$ goes to 0. Our method works particularly well with sparse ANOVA kernels, which are inspired by the convolutional layer of CNNs. We validate our methods on datasets in different domains, such as MNIST and TIMIT, showing that deterministic features are faster to generate and achieve accuracy comparable to the state-of-the-art kernel methods based on random Fourier features.
Higher-Order Factorization Machines
Blondel, Mathieu, Fujino, Akinori, Ueda, Naonori, Ishihata, Masakazu
Factorization machines (FMs) are a supervised learning approach that can use second-order feature combinations even when the data is very high-dimensional. Unfortunately, despite increasing interest in FMs, there exists to date no efficient training algorithm for higher-order FMs (HOFMs). In this paper, we present the first generic yet efficient algorithms for training arbitrary-order HOFMs. We also present new variants of HOFMs with shared parameters, which greatly reduce model size and prediction times while maintaining similar accuracy. We demonstrate the proposed approaches on four different link prediction tasks.
ANOVA kernels and RKHS of zero mean functions for model-based sensitivity analysis
Durrande, Nicolas, Ginsbourger, David, Roustant, Olivier, Carraro, Laurent
Given a reproducing kernel Hilbert space H of real-valued functions and a suitable measure mu over the source space D (subset of R), we decompose H as the sum of a subspace of centered functions for mu and its orthogonal in H. This decomposition leads to a special case of ANOVA kernels, for which the functional ANOVA representation of the best predictor can be elegantly derived, either in an interpolation or regularization framework. The proposed kernels appear to be particularly convenient for analyzing the e ffect of each (group of) variable(s) and computing sensitivity indices without recursivity.